Abstract: In this paper, motivated by the setting of white-space
detection [1], we present theoretical and empirical results for
detection of the zero-support $\mathcal{E}$ of $x \in \mathbb{C}^p$
($x_i = 0$ for $i \in \mathcal{E}$) with reduced-dimension linear
measurements. We propose two low-complexity algorithms based on
one-step thresholding [2] for this purpose. The second algorithm is a
variant of the first that further assumes the presence of
group-structure in the target signal $x$ [3]. Performance guarantees
for both algorithms based on the worst-case and average coherence
(group coherence) of the measurement matrix are presented along with
the empirical performance of the algorithms.

Index Terms — Zero-Detection, White-Space Detection, 
Compressed-Sensing, Dimensionality-Reduction, Average 
Coherence, Average Group Coherence. 

I. Introduction 

The principal idea that underlies research in the area of
big data [4] is that the majority of information in many
signals of interest is structured and therefore lies in a subset of
much lower dimension than the ambient signal dimension.
This idea was first popularized by the theory of Compressed
Sensing [5] (CS), which demonstrated that a vector $x \in \mathbb{R}^p$
with $k$-sparse "non-zero" support ($\|x\|_0 = k$) could be recovered
with $n = O(k \log p) \ll p$ non-adaptive linear measurements
$y = Ax$, where $A \in \mathbb{C}^{n \times p}$. The initial results
prescribed the use of random sensing matrices and signal recovery via
solving a linear program (LP) that finds, among all solutions
consistent with the measurements, the one with minimum $\ell_1$ norm.
The advent of CS inspired a large amount of research in areas
related to dimensionality reduction (DR) with goals spanning
the exploitation of different kinds of structure [6], reduced-dimension
signal processing [7], structured sensing matrix design [8], [9],
and efforts at employing its results [10].

While much work has been done within the DR framework, 
one area that has remained relatively unexplored is the detection 
of zeros: given measurements found in the standard CS setup, 
we are interested in detecting the support of the entries of x that 
are equal to zero. Philosophically, the goal of finding zeros can 
be interpreted as detecting absence/non-existence. This goal can 
be found in many resource-allocation applications where the 
goal is to cheaply query a system of interest to determine what 
is not being used or not working. One conspicuous example
where this goal manifests itself is white-space detection [1].
White-space detection is a sub-problem of the efficient spectrum
sensing problem, whose goal is to use large swaths of bandwidth
more efficiently by designing spectrum sensors that
quickly find and opportunistically communicate over unused
pieces of spectrum. Many research efforts with the aim of
addressing this problem have been heavily influenced by CS-like
ideas in recent years. A common strategy is to use the
sparse-approximation/random sampling machinery and recover 
the entire spectrum (or its support) to determine the location 
of unoccupied channels. This approach, given the goal of 
finding free channels to transmit across, is inefficient in several 
respects. The first is that it solves an estimation problem for what
is intrinsically a detection problem. While exact knowledge of
spectrum usage is ideal, it is often sufficient and less costly 
to obtain a large subset of the locations of unused pieces 
of spectrum. In particular, more efficient detection of unused 
pieces of spectrum can become critically important in situations 
where the system is required to quickly adapt, e.g., the support 
is changing rapidly. The second is that spectrum usage exhibits 
group behavior, i.e., use of one portion of spectrum is often 
indicative of activity in other portions of the spectrum. For 
example, the spectrum is divided into channels, and most of a
channel's bandwidth is likely to be active at once.

In this paper, drawing inspiration from the setting of white-space
detection, we are concerned with a specific type of
zero detection problem: detection of a large subset of the zero
elements, without requiring complete or exact support/zero
pattern recovery. An additional goal is to design algorithms that
have low complexity and that are amenable to use in an adaptive
setting. In this spirit, we present two algorithms in Sec. II-B that
utilize methods and results from work in support detection [2],
[11] and group model selection [6]. The first algorithm (Alg. 1)
is a simple modification of one-step thresholding (OST) [2] and
the second (Alg. 2) is an extension of OST in the setting of
group model selection in [3]. The performance guarantees for
these algorithms are presented in Sec. III. The proofs of the
guarantees are given in Appendices A and B. The proofs utilize
the concepts of average coherence/group coherence ($\nu$, $\nu_g$),
worst-case coherence/group coherence ($\mu$, $\mu_g$), the statistical
orthogonality condition (StOC), and the coherence property
(CP) [2], [12]. Numerical simulations of the two algorithms
are presented in Sec. IV and the paper concludes in Sec. V.



II. Problem Formulation and Algorithms 

Let $x \in \mathbb{C}^p$ where $\|x\|_0 = k$. Denote the zero-support of
$x$ by $\mathcal{E} \subset \{1, \dots, p\}$ and its complement by
$\mathcal{I} = \mathcal{E}^c$: $x_i = 0$ for $i \in \mathcal{E}$.
The two zero-detection algorithms presented in this paper
generate estimates of the zero-support ($\hat{\mathcal{E}}$) for the
following two measurement models corresponding to the presence/absence
of group-structure in $x$.

A. Measurement Models

1) Non-group-structure model: The non-group-structure setting
corresponds to the standard model of CS given by

$$y = Ax + w, \tag{II.1}$$

where $y \in \mathbb{C}^{n \times 1}$ is the measurement vector,
$A \in \mathbb{C}^{n \times p}$ ($n \ll p$) is the measurement matrix
with unit-norm columns, $x \in \mathbb{C}^p$ is the signal vector
($k = \|x\|_0$), and $w \sim \mathcal{N}(0, I\sigma^2)$. The
zero-detection algorithm corresponding to this setting is Alg. 1.

2) Group-structure model: The group-structure model corresponds
to scenarios, such as statistical model selection, where
the existence of a single entry in $x$ implies the presence of
other related entries in the true model. In this paper we examine
situations where there are $q$ groups with each group consisting
of $r$ entries of $x$. In this case, we modify model (II.1) to

$$y = \sum_{i=1}^{q} A_i x_i + w = Ax + w, \tag{II.2}$$

where $A_i \in \mathbb{C}^{n \times r}$ is a sub-matrix of $A$, and
$x_i$ are the coefficients associated with group $i$. Let the set
$\mathcal{K} = \{1 \le i \le q : x_i \neq 0\}$ denote the true
underlying model with $k = |\mathcal{K}|$ groups that have non-zero
coefficients. When discussing group-structure, $\mathcal{E}$ will
denote the indices of groups that have zero coefficients. The
zero-detection algorithm corresponding to this setting is Alg. 2.

B. Zero detection algorithms 

Both zero-detection algorithms (Algs. 1 and 2) generate an estimate
of the zero-support of $x$ ($\hat{\mathcal{E}}$) by applying the
Hermitian transpose of the measurement matrix, $A^H$, to the output
measurements and retaining the indices of the
$\theta = |\hat{\mathcal{E}}|$ lowest-magnitude coefficients.
Intuitively, the underlying idea behind this operation
is similar to that of orthogonal matching pursuit (OMP) [13]
and OST in that it expresses the belief that low correlation of
the output with the $i$-th column of the measurement matrix ($a_i$)
is indicative of the fact that $x_i = 0$.

Algorithm 1 Zero-Detection One-Step Thresholding (ZD-OST)

1: Input: measurements $y$, design matrix $A$, number of empty
   bands to select $\theta$
2: Initialization: $\hat{\mathcal{E}} = \emptyset$
3: Obtain measurements and apply processing matrix: $s = A^H y$.
4: Sort $|s_i|$ in ascending magnitude and assign to $s$.
5: Construct the set of indices of the $\theta$ lowest magnitudes:
   $\hat{\mathcal{E}} = s(1 : \theta)$.
6: Output $\hat{\mathcal{E}}$.
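The steps of Alg. 1 can be sketched in a few lines of code. The following is a minimal illustration in plain Python, not the authors' implementation: the function name `zd_ost`, the list-of-rows matrix layout, and the toy identity measurement matrix (which has $n = p$ and is therefore not a reduced-dimension example) are all assumptions made for readability.

```python
def zd_ost(y, A, theta):
    """Return indices of the theta columns of A least correlated with y.

    A is a list of n rows, each a list of p (possibly complex) entries.
    """
    n, p = len(A), len(A[0])
    # s = A^H y: correlate every column a_i with the measurements.
    s = [sum(A[k][i].conjugate() * y[k] for k in range(n)) for i in range(p)]
    # Sort column indices by ascending |s_i| and keep the first theta.
    order = sorted(range(p), key=lambda i: abs(s[i]))
    return sorted(order[:theta])


# Toy noiseless example with orthonormal columns (hypothetical data),
# so that s equals x exactly.
A = [[1.0 if row == col else 0.0 for col in range(4)] for row in range(4)]
x = [3.0, 0.0, -2.0, 0.0]          # zero-support is {1, 3} (0-based)
y = [sum(A[row][col] * x[col] for col in range(4)) for row in range(4)]
zero_estimate = zd_ost(y, A, theta=2)
```

With coherent columns and noise, the correlations $a_i^H y$ for $i \in \mathcal{E}$ are no longer exactly zero, which is why the guarantees in Sec. III are stated in terms of coherence and SNR.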



III. Performance Guarantees

Since we are interested in estimating sets $\hat{\mathcal{E}}$ that
with high probability contain subsets of the zero-support, the metrics
with which we establish performance guarantees for ZD-OST and
ZD-GroTh are the false-discovery proportion (FDP)

$$\mathrm{FDP}(\hat{\mathcal{E}}) = \frac{|\hat{\mathcal{E}} \setminus \mathcal{E}|}{|\hat{\mathcal{E}}|}, \tag{III.1}$$

as well as the probability of error
$P_e = \mathbb{P}\{\hat{\mathcal{E}} \cap \mathcal{E} = \emptyset\}$.

Algorithm 2 Zero-Detection Group Thresholding (ZD-GroTh)

1: Input: measurements $y$, design matrix $A$, size of the group
   $r$, number of empty groups to select $\theta$
2: Initialization: $\hat{\mathcal{E}} = \emptyset$
3: Obtain measurements and apply the following:
   $s_i = \|A_i^H y\|_2$, $i = 1, \dots, q$.
4: Sort $|s_i|$ in ascending magnitude and assign to $s$.
5: Construct the set of indices of the $\theta$ lowest magnitudes:
   $\hat{\mathcal{E}} = s(1 : \theta)$.
6: Output $\hat{\mathcal{E}}$.
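To make Alg. 2 and the FDP metric concrete, the sketch below thresholds per-group correlation energies and then evaluates (III.1) on the result. It is an illustration under stated assumptions rather than the authors' code: the group statistic is taken to be $\|A_i^H y\|_2$, group $i$ is assumed to occupy $r$ contiguous columns of $A$, and the names `zd_groth` and `fdp` as well as the toy data are hypothetical.

```python
import math


def zd_groth(y, A, r, theta):
    """Return indices of the theta groups with the smallest correlation energy.

    Columns g*r .. (g+1)*r - 1 of A are treated as group g (an assumption).
    """
    n, p = len(A), len(A[0])
    q = p // r
    s = []
    for g in range(q):
        # Assumed statistic: s_g = || A_g^H y ||_2.
        energy = 0.0
        for i in range(g * r, (g + 1) * r):
            corr = sum(A[k][i].conjugate() * y[k] for k in range(n))
            energy += abs(corr) ** 2
        s.append(math.sqrt(energy))
    order = sorted(range(q), key=lambda g: s[g])
    return sorted(order[:theta])


def fdp(estimate, zero_support):
    """False-discovery proportion: |estimate minus zero_support| / |estimate|."""
    estimate, zero_support = set(estimate), set(zero_support)
    return len(estimate - zero_support) / len(estimate)


# Toy noiseless example (hypothetical): identity matrix, q = 3 groups of r = 2.
A = [[1.0 if row == col else 0.0 for col in range(6)] for row in range(6)]
x = [0.0, 0.0, 1.5, -0.5, 0.0, 0.0]   # only group 1 is active
y = [sum(A[row][col] * x[col] for col in range(6)) for row in range(6)]
groups = zd_groth(y, A, r=2, theta=2)
```

An FDP of zero means every selected group lies in the true zero-support; selecting one active group out of two would give an FDP of 0.5.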

A. Performance Guarantees for ZD-OST 

In order to establish performance guarantees for ZD-OST,
it is necessary to define several quantities and review a few
concepts central to the main arguments. Let $x_{(m)}$ denote the
$m$-th largest magnitude non-zero entry of $x$. Hence
$|x_{(1)}| \ge |x_{(2)}| \ge \dots \ge |x_{(k)}|$. Define the
signal-to-noise ratio (SNR), the $m$-th largest-to-average ratio
($\mathrm{LAR}_m$), and the minimum SNR ($\mathrm{SNR}_{\min}$) as

$$\mathrm{SNR} = \frac{\|x\|_2^2}{\mathbb{E}[\|w\|_2^2]}, \qquad \mathrm{LAR}_m = \frac{|x_{(m)}|^2}{\|x\|_2^2 / k}, \qquad \mathrm{SNR}_{\min} = \frac{|x_{(k)}|^2}{\sigma^2}, \tag{III.2}$$



respectively. In addition, we define two coherence properties
of the unit-column norm matrix $A$: the worst-case coherence
(Eq. (III.3)) and the average coherence (Eq. (III.4)):

$$\mu(A) = \max_{i \neq j} |a_i^H a_j|, \tag{III.3}$$

$$\nu(A) = \frac{1}{p-1} \max_{i} \Big| \sum_{j \neq i} a_i^H a_j \Big|. \tag{III.4}$$
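Both coherence quantities can be read directly off the Gram matrix $A^H A$. The sketch below assumes the standard definitions from [2], namely $\mu(A) = \max_{i \neq j} |a_i^H a_j|$ and $\nu(A) = \frac{1}{p-1} \max_i |\sum_{j \neq i} a_i^H a_j|$; the function name `coherences` and the small $2 \times 3$ example matrix are hypothetical.

```python
import math


def coherences(A):
    """Return (mu, nu): worst-case and average coherence of A.

    A is a list of n rows with p unit-norm columns.
    """
    n, p = len(A), len(A[0])
    # Gram matrix G[i][j] = a_i^H a_j.
    G = [[sum(A[k][i].conjugate() * A[k][j] for k in range(n))
          for j in range(p)] for i in range(p)]
    mu = max(abs(G[i][j]) for i in range(p) for j in range(p) if i != j)
    nu = max(abs(sum(G[i][j] for j in range(p) if j != i))
             for i in range(p)) / (p - 1)
    return mu, nu


# Hypothetical 2 x 3 example with unit-norm columns e1, e2, (e1 - e2)/sqrt(2).
c = 1 / math.sqrt(2)
A = [[1.0, 0.0, c],
     [0.0, 1.0, -c]]
mu, nu = coherences(A)
```

In this example the off-diagonal inner products of the third column cancel in the sum, so the average coherence comes out at half the worst-case coherence; this sign cancellation is what allows $\nu$ to be much smaller than $\mu$.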



We further define the statistical orthogonality condition (StOC). 

Definition 1 (Statistical Orthogonality Condition (Def. 3 [2])).
Let $\bar{\Pi} = (\pi_1, \dots, \pi_p)$ be a random permutation of
$\{1, \dots, p\}$, and define $\Pi = (\pi_1, \dots, \pi_k)$ and
$\Pi^c = (\pi_{k+1}, \dots, \pi_p)$ for any $k \in \{1, \dots, p\}$.
Then the $n \times p$ normalized design matrix $A$ is said to satisfy
the $(k, \epsilon, \delta)$-statistical orthogonality condition
(StOC) if there exist $\epsilon, \delta \in [0, 1)$ such that the
inequalities

$$\|(A_{\Pi}^H A_{\Pi} - I) z\|_\infty \le \epsilon \|z\|_2, \tag{III.5}$$

$$\|A_{\Pi^c}^H A_{\Pi} z\|_\infty \le \epsilon \|z\|_2 \tag{III.6}$$

hold for every fixed $z \in \mathbb{C}^k$ with probability exceeding
$1 - \delta$ with respect to the random permutation $\bar{\Pi}$.

Having established the above conventions, we can now 
present the following theorem for the performance of ZD-OST. 

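Theorem 1. Assume that the noise is $w \sim \mathcal{CN}(0, \sigma^2)$,
and $\mu = \mu_0/\sqrt{\log p}$ for some constant $\mu_0 > 0$. Also
assume that $\mathrm{SNR}_{\min} > 16 \log p$.

1) Let $\epsilon = (\sqrt{\mathrm{SNR}_{\min}} - 4\sqrt{\log p})/(2\sqrt{\mathrm{SNR}}) > 0$.
When $\theta = 1$, if

$$k \le \min\left\{ \left( \frac{\epsilon - 4(2 + a^{-1})\mu_0}{\nu} \right)^2,\ (1 + a)^{-1} p \right\}, \tag{III.7}$$

for some $a > 1$, then $P_e \le \sqrt{2/\pi}\, p^{-1} + 4p^{1-\alpha}$,
where $\alpha = (\epsilon - \sqrt{k}\nu)^2/(c\mu_0^2) > 1$ with
$c = 16(2 + a^{-1})^2$.

2) If (III.7) holds, then with probability exceeding
$1 - \sqrt{2/\pi}\, p^{-1} - 4p^{1-\alpha}$,

$$\mathrm{FDP}(\hat{\mathcal{E}}) \le (k - m)/\theta, \tag{III.8}$$

where $m$ is the largest integer for which the following is true:

$$\mathrm{LAR}_m > \max\left\{ \frac{c_1 k \log p}{n\, \mathrm{SNR}},\ c_2 \mu^2 k \log p \right\},$$

with $c_1 = 32t^{-1}$, $c_2 = 800(1 - t)^{-1}$ for some $t \in (0, 1)$.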

Remarks: To interpret condition (III.7) in Theorem 1, note that since
$\mathrm{SNR}_{\min} > 16 \log p$, we can choose
$|x_{\min}|^2 = (1 + \gamma) 16 \sigma^2 \log p$
for some constant $\gamma > 1$. For this choice,
$\mathrm{SNR} = (k/n)(1 + \gamma) 16 \log p$, and
$\epsilon = \sqrt{n/k}\,(\sqrt{1 + \gamma} - 1)/(2\sqrt{1 + \gamma})$.
In the high SNR regime, $\gamma \to \infty$,
$\epsilon \to (1/2)\sqrt{n/k}$, and hence the first
term in (III.7) tends to
$((1/2)\sqrt{n/k} - 4(2 + a^{-1})\mu_0)^2/\nu^2$, and
when $n/k \gg 64(2 + a^{-1})^2 \mu_0^2$, this is approximately
$n/(4k\nu^2)$. This demonstrates that if $\nu$ is sufficiently small,
the first term in (III.7) is not binding, which implies that we may
not need $k$ to be very small relative to $n$. This is also
demonstrated by the numerical experiments in Sec. IV that show
successful recovery of large subsets of zeros even in the absence of
sparsity.
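The limiting behavior described in the Remarks can be checked numerically. The sketch below evaluates $\epsilon = \sqrt{n/k}\,(\sqrt{1+\gamma} - 1)/(2\sqrt{1+\gamma})$ for increasing $\gamma$ and compares the first term of (III.7) with its approximation $n/(4k\nu^2)$. All numerical values ($n$, $k$, $a$, $\mu_0$, $\nu$) are hypothetical and chosen only for illustration, and the function names are ours.

```python
import math


def epsilon(n, k, gamma):
    """epsilon when |x_min|^2 = (1 + gamma) * 16 * sigma^2 * log(p)."""
    root = math.sqrt(1 + gamma)
    return math.sqrt(n / k) * (root - 1) / (2 * root)


def first_term(n, k, gamma, a, mu0, nu):
    """First argument of the min in (III.7)."""
    return (epsilon(n, k, gamma) - 4 * (2 + 1 / a) * mu0) ** 2 / nu ** 2


n, k = 1024, 4                       # hypothetical dimensions
limit = 0.5 * math.sqrt(n / k)       # high-SNR limit of epsilon
values = [epsilon(n, k, g) for g in (1, 10, 100, 1e6)]

# Ratio of the exact first term to the approximation n / (4 k nu^2)
# for small (hypothetical) coherence parameters.
mu0, nu = 0.01, 0.01
ratio = first_term(n, k, 1e6, 1.0, mu0, nu) / (n / (4 * k * nu ** 2))
```

As $\gamma$ grows, `values` increases monotonically toward `limit`, and `ratio` approaches 1 as $\mu_0$ shrinks relative to $\sqrt{n/k}$.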

B. Performance Guarantee for ZD-GroTh 

In order to present performance guarantees for group thresholding,
we will need to introduce a few additional concepts.
First, we define the group-structure analogues of Eqs. (III.3)
and (III.4), the worst-case group coherence and the average group
coherence, as

$$\mu_g(A) = \max_{i \neq j} \|A_i^H A_j\|_2, \qquad \nu_g(A) = \frac{1}{q-1} \max_{i = 1, \dots, q} \Big\| \sum_{j \neq i} A_i^H A_j \Big\|_2. \tag{III.9}$$

In addition, we define the group coherence property.

Definition 2 (The Group Coherence Property (Def. 1 [3])).
The $n \times rq$ measurement matrix $A$ is said to satisfy the group
coherence property if the following two conditions hold for
some positive constants $c_\mu$ and $c_\nu$:

$$\mu_g \le \frac{c_\mu}{\sqrt{\log q}}, \qquad \nu_g \le c_\nu \mu_g \sqrt{\frac{r \log q}{n}}. \tag{III.11}$$

Let $x_{(i)}$ be the $i$-th largest group of non-zero coefficients:
$\|x_{(1)}\|_2 \ge \|x_{(2)}\|_2 \ge \dots \ge \|x_{(k)}\|_2 > 0$.
The following theorem is adapted from (Theorem 1, [14]):

Theorem 2. Suppose $A$ satisfies the group coherence property
with parameters $c_\mu$ and $c_\nu$. Fix parameters $c_1 > 2$,
$c_2 \in (0, 1)$, and define the parameter
$c_3 = [32\sqrt{2e}\,(2c_1 - 1)]/[(1 - c_2)(c_1 - 1)]$.
Then, under the assumptions $c_1 r k \le n$, $c_\mu < c_3^{-1}$, and
$c_\nu < \sqrt{c_1}\, c_2 c_3$, we have with probability at least
$1 - (1 + e^2) q^{-1}$ that
$\mathrm{FDP}(\hat{\mathcal{E}}) \le (k - m)/\theta$, where $m$ is the
largest integer for which the inequality

$$\|x_{(m)}\|_2 \ge c_3 \mu_g \|x\|_2 \sqrt{\log q} + 2\sigma\sqrt{2 \log q + (r/2) \log 2}$$

holds. The probability is with respect to
the uniform distribution of the true model $\mathcal{K}$ over all
possible models.

IV. Numerical Experiments 

This section experimentally demonstrates the efficacy of ZD-OST
and ZD-GroTh at obtaining $\hat{\mathcal{E}}$ containing large subsets
of the zero support. We demonstrate the performance of ZD-OST
and ZD-GroTh using both a random Bernoulli matrix and the
$M \times M^2$ ($M = 2^{m+1}$ with $m$ an odd integer) matrix of
Kerdock-Preparata codes [15] of dimension $16 \times 256$. The results
are presented in terms of both FDP (Eq. (III.1)) and $P_e$ as a
function of the sparsity level of $x$ in the frequency domain,
$k = \|\beta\|_0 = \|Fx\|_0$. The input signal for tests of ZD-OST
consisted of a superposition of $k$ tones from the DFT grid:

$$x_\ell = \sum_{j \in \Omega,\ k = |\Omega|} \beta_j e^{\mathrm{i} 2\pi j \ell / p}, \qquad \Omega \subset \{-(p/2 - 1), \dots, p/2\}. \tag{IV.1}$$



For the tests of ZD-GroTh, the random support consisted of
randomly choosing $k$ groups of $r = 8$ tones. Figures 1, 2,
and 3 show results for ZD-OST, and Figures 4(a) and 4(b) show
results for ZD-GroTh, including performance comparisons to
ZD-OST. Fig. 1 shows the FDP performance of ZD-OST
with respect to $k$ for several $\theta$. Also note that, as $k$
becomes comparable to $p$, a large fraction of $\hat{\mathcal{E}}$
corresponds to elements of the true zero-support. This suggests that
zero-detection is amenable to use in an adaptive setting that
enables high-probability detection of zeros via remeasurement
of the reduced set $\hat{\mathcal{E}}$. This is further evidenced by
Fig. 2, which shows the $P_e$ performance for very low values of
$\theta$. Although the objectives differ considerably, it is
illustrative to compare the $P_e$ (Fig. 3) for different types of
support recovery objectives via OST versus the $P_e$ of detecting a
single zero when $\theta = 1$. The $P_e$ for zero-detection remains
considerably lower than its counterparts. While the $P_e$ is far worse
in the regime of high $k$, in terms of applications like white-space
detection, partial support recovery may not be as useful as
partial zero-support recovery. Figure 4 illustrates that, in the
presence of group-structure, ZD-GroTh considerably outperforms
ZD-OST in both FDP and $P_e$. We also point out that the structured
Kerdock-Preparata codes display superior performance to the random
Bernoulli matrix.

[Figure: FDP vs. k for Kerdock codes, n = 16, p = 256]
Fig. 1: The FDP of Kerdock codes as a function of $k$ for several
values of $\theta$. Each data point represents the average of 5000
trials. The tone amplitudes were uniformly distributed in [1, 1000]
and $\sigma^2 = 500$. Note that in cases where
$\theta > |\mathcal{E}| = p - k$ we plotted the quantity
$|\hat{\mathcal{E}} \cap \mathcal{E}|/|\mathcal{E}|$ to represent the
total fraction of the zero-support recovered.

[Figure: Pe vs. k for Kerdock codes, n = 16, p = 256]
Fig. 2: $P_e$ versus $k$ for different choices of $\theta$ using
Kerdock codes with dimension $16 \times 256$. Each point was generated
based on 5000 independent trials. The simulation conditions were the
same as those used in Fig. 1.

[Figure: Comparisons of Pe using Kerdock codes, n = 16, p = 256.
Legend entries include: Pe of detecting complete support; FDP of
detecting complete support; Pe of random sampling.]
Fig. 3: A comparison of the probability of error of detecting one zero
($P_e$ when $\theta = 1$), the probability of error of detecting one
non-zero (using OST), and the probability of error and FDP of
detecting the complete support. This example demonstrates that it is
much easier to detect one zero than to recover the complete support,
which matters because in many scenarios all we want is the location
of "one zero".

[Figure: (a) FDP vs. k for Kerdock codes, group-thresholding,
n = 16, p = 256; (b) FDP vs. k for Bernoulli matrices,
group-thresholding, n = 16, p = 256. Legend: Non-Group,
$\theta$ = 8, 16, 24; Group, $\theta$ = 1, 2, 3.]
Fig. 4: A comparison of the performance of ZD-OST and ZD-GroTh for (a)
Kerdock-Preparata codes and (b) the random Bernoulli matrix. The
comparisons are made for the same overall number of tones, which is
why the $\theta$ in the non-group data points is 8 times the
corresponding value for the group data points.

V. Conclusion 

In this paper, motivated by the setting of white-space detection,
we investigated using reduced-dimension measurements of
a target signal $x$ to detect large subsets of its zero-support. Two
algorithms based on OST, ZD-OST and ZD-GroTh, were presented
to detect zeros both when group-structure is present in $x$ and when
it is absent. Performance guarantees in terms of
the probability of error and the FDP were proven in terms of
the worst-case and average coherence (group coherence) of the
measurement matrix. The performance of the algorithms
was investigated empirically using both random Bernoulli
measurement matrices and deterministic Kerdock-Preparata
matrices. The numerical experiments demonstrated that a high
proportion of the elements of the detected zero-support sets
($\hat{\mathcal{E}}$), even those of small cardinality
($\theta \ll p$), were elements of the true $\mathcal{E}$. We also
note that even in regimes where the non-zero support is a large
fraction of the signal dimension ($k \approx 0.8p$), a substantial
fraction of $\hat{\mathcal{E}}$ consisted of elements of
$\mathcal{E}$. We leave for future work
extending our theory to cover the case of large $k$. Finally, we
point out that even in situations where detecting zeros is
not the direct goal, efficient methods for finding zeros could still
make a considerable impact if they are incorporated into other
recovery algorithms. For example, if methods for finding zeros
are efficient and reliable, they could be used to improve the
speed and cost of computation by reducing the search space
through quick determination of additional constraints in other
recovery algorithms.

Appendix A 
Proof of Theorem 1 



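Proof: Let $\tau = 2\sigma\sqrt{\log p}$. Define
$\mathcal{G} = \{\max_{i = 1, \dots, p} |a_i^H w| \le \tau\}$.
We can show that $\mathcal{G}$ occurs with probability at least
$1 - \sqrt{2/\pi}\, p^{-1} (\log p)^{-1/2}$. To prove the first part of
Theorem 1, note that when $\mathcal{G}$ occurs and StOC is satisfied,

$$\min_{i \in \mathcal{E}} |a_i^H y| = \min_{i \in \mathcal{E}} \Big| \sum_{j \in \mathcal{I}} a_i^H a_j x_j + a_i^H w \Big| \le \min_{i \in \mathcal{E}} \Big| \sum_{j \in \mathcal{I}} a_i^H a_j x_j \Big| + \tau \le \epsilon \|x\|_2 + \tau. \tag{A.1}$$

On the other hand, when $\mathcal{G}$ occurs and StOC is satisfied,

$$\min_{i \in \mathcal{I}} |a_i^H y| = \min_{i \in \mathcal{I}} \Big| x_i + \sum_{j \neq i} a_i^H a_j x_j + a_i^H w \Big| \ge \min_{i \in \mathcal{I}} |x_i| - \max_{i \in \mathcal{I}} \Big| \sum_{j \neq i} a_i^H a_j x_j \Big| - \tau \ge |x_{\min}| - \epsilon \|x\|_2 - \tau. \tag{A.2}$$

Hence, when

$$|x_{\min}| > 2\epsilon \|x\|_2 + 2\tau, \tag{A.3}$$

we also have
$\min_{i \in \mathcal{E}} |a_i^H y| < \min_{i \in \mathcal{I}} |a_i^H y|$.
This shows that under $\mathcal{G}$ and StOC, if (A.3) is satisfied,
then for $\theta = 1$, $\hat{\mathcal{E}} \subset \mathcal{E}$.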

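In [2], it is shown that an $n \times p$ design matrix satisfies
$(k, \epsilon, \delta)$-StOC for any $\epsilon \in [0, 1)$ with

$$\delta \le 4p \exp\left\{ -\frac{(\epsilon - \sqrt{k}\nu)^2}{16(2 + a^{-1})^2 \mu^2} \right\}$$

for $a > 1$ and $k \le \min\{\epsilon^2 \nu^{-2}, (1 + a)^{-1} p\}$.

Next we choose the parameters such that StOC holds with high
probability. Substituting $\mu = \mu_0/\sqrt{\log p}$, we have that
$\exp\{-(\epsilon - \sqrt{k}\nu)^2/(16(2 + a^{-1})^2 \mu^2)\} = p^{-\alpha}$,
where $\alpha = (\epsilon - \sqrt{k}\nu)^2/(c\mu_0^2)$ and
$c = 16(2 + a^{-1})^2$. We want $\alpha > 1$ so that the bound on the
probability of StOC is tight, which is satisfied when
$k \le (\epsilon - \sqrt{c}\mu_0)^2/\nu^2 \le \epsilon^2/\nu^2$. Hence
for this choice of parameters, we have that
$\delta \le 4p^{1-\alpha}$, $\alpha > 1$, when
$k \le \min\{(\epsilon - \sqrt{c}\mu_0)^2/\nu^2, (1 + a)^{-1} p\}$,
for a constant $a > 1$. We want to choose the largest $\epsilon$
possible to make this bound tight, and from (A.3), for
$\tau = 2\sigma\sqrt{\log p}$, the largest such
$\epsilon \approx (|x_{\min}| - 2\tau)/(2\|x\|_2) = (\sqrt{\mathrm{SNR}_{\min}} - 4\sqrt{\log p})/(2\sqrt{\mathrm{SNR}})$.

Combining the results above, we have that
$\mathbb{P}\{\hat{\mathcal{E}} \subset \mathcal{E}\} \ge \mathbb{P}\{\mathcal{G} \cap \mathrm{StOC}\} \ge (1 - \sqrt{2/\pi}\, p^{-1} (\log p)^{-1/2})(1 - \delta) \ge 1 - \sqrt{2/\pi}\, p^{-1} (\log p)^{-1/2} - 4p^{1-\alpha}$.
Thus the proof is finished by writing
$P_e \le 1 - \mathbb{P}\{\hat{\mathcal{E}} \subset \mathcal{E}\} \le \sqrt{2/\pi}\, p^{-1} + 4p^{1-\alpha}$.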

To prove the second part, notice that for $i \in \mathcal{I}$, similar
to (A.2), we have that when $\mathcal{G}$ occurs and under StOC

$$|a_i^H y| \ge |x_i| - \epsilon \|x\|_2 - \tau. \tag{A.4}$$

Hence if

$$|x_i| > 2\epsilon \|x\|_2 + 2\tau, \tag{A.5}$$

for $i \in \mathcal{I}$, we have that
$|a_i^H y| > \max_{j \in \mathcal{E}} |a_j^H y|$ and hence
$i \notin \hat{\mathcal{E}}$.
Suppose $m$ is the largest integer for which the following is
true: $|x_{(m)}| > 2\epsilon \|x\|_2 + 2\tau$. Let $a_{(i)}$ denote the
column of $A$ corresponding to $x_{(i)}$. Hence
$|a_{(i)}^H y| > \max_{j \in \mathcal{E}} |a_j^H y|$ for
$i = 1, \dots, m$, $m \le k$. Hence the number of components that are
incorrectly detected can be at most $k - m$. Hence, we have

$$\mathrm{FDP}(\hat{\mathcal{E}}) \le (k - m)/\theta, \tag{A.6}$$

when $\mathcal{G}$ occurs and StOC is satisfied. Finally, the theorem
can be proved by noting that $|x_{(m)}| > 2\epsilon \|x\|_2 + 2\tau$ is
equivalent to $|x_{(m)}| > 2\epsilon \|x\|_2 / t$ and
$|x_{(m)}| > 2\tau/(1 - t)$ for some $t \in (0, 1)$. As
shown above, the probability that both $\mathcal{G}$ and StOC occur is
at least $1 - \sqrt{2/\pi}\, p^{-1} - 4p^{1-\alpha}$. This finishes the
proof. ■

Appendix B 
Proof of Theorem 2 

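Let $X_{\mathcal{K}}$ denote the $n \times rk$ sub-matrix of $X$ that
corresponds to the non-zero blocks, and let $x_{\mathcal{K}}$ denote
the corresponding $rk \times 1$ sub-vector of $x$ (here $X$ denotes
the measurement matrix, written $A$ in Theorem 2, and $X_i$ its $i$-th
group of columns). Define
$\tilde{\mathcal{K}} = \{i \in \mathcal{K} : \|x_i\|_2 \ge c_3 \mu_g \|x\|_2 \sqrt{\log q}\}$.
Then we have

$$\min_{i \in \tilde{\mathcal{K}}} \|X_i^H y\|_2 = \min_{i \in \tilde{\mathcal{K}}} \|x_i + (X_i^H X_{\mathcal{K}} x_{\mathcal{K}} - x_i) + X_i^H w\|_2 \ge \min_{i \in \tilde{\mathcal{K}}} \|x_i\|_2 - \max_{i \in \mathcal{K}} \|X_i^H X_{\mathcal{K}} x_{\mathcal{K}} - x_i\|_2 - \max_{i \in \mathcal{K}} \|X_i^H w\|_2$$
$$= \min_{i \in \tilde{\mathcal{K}}} \|x_i\|_2 - \|(X_{\mathcal{K}}^H X_{\mathcal{K}} - I) x_{\mathcal{K}}\|_{2,\infty} - \max_{i \in \mathcal{K}} \|X_i^H w\|_2, \tag{B.1}$$

$$\max_{i \in \mathcal{K}^c} \|X_i^H y\|_2 \le \max_{i \in \mathcal{K}^c} \|X_i^H X_{\mathcal{K}} x_{\mathcal{K}}\|_2 + \max_{i \in \mathcal{K}^c} \|X_i^H w\|_2. \tag{B.2}$$

Hence,
$\|x_{(L)}\|_2 > \|(X_{\mathcal{K}}^H X_{\mathcal{K}} - I) x_{\mathcal{K}}\|_{2,\infty} + \max_{i \in \mathcal{K}^c} \|X_i^H X_{\mathcal{K}} x_{\mathcal{K}}\|_2 + \max_{i = 1, \dots, q} \|X_i^H w\|_2$
is a sufficient condition for
$\min_{i \in \tilde{\mathcal{K}}} \|X_i^H y\|_2 > \max_{i \in \mathcal{K}^c} \|X_i^H y\|_2$.
Define $\mathcal{G} = \{\max_{i = 1, \dots, q} \|X_i^H w\|_2 \le \tau\}$.
Note that $\|X_i^H w\|_2^2$ is a $\chi^2$ random variable with $r$
degrees of freedom. Using the Chernoff bound, we have
$\mathbb{P}\{\|X_i^H w\|_2^2 \ge \tau^2\} \le e^{-t\tau^2/\sigma^2} (1 - 2t)^{-r/2}$
for $t \in (0, 1/2)$. Choosing $t = 1/4$, we obtain the bound
$e^{-\tau^2/(4\sigma^2)}\, 2^{r/2}$. From Sidak's Lemma, we have

$$\mathbb{P}\Big\{\max_{i = 1, \dots, q} \|X_i^H w\|_2 \le \tau\Big\} \ge 1 - q\, e^{-\tau^2/(4\sigma^2)}\, 2^{r/2}.$$

Let $\tau = 2\sigma\sqrt{2 \log q + (r/2) \log 2}$. This demonstrates that
$\max_{i = 1, \dots, q} \|X_i^H w\|_2 \le \tau$, with $\tau$ defined
above, occurs with probability at least $1 - q^{-1}$. Combining this
noise bound with [Proof of Theorem 1 in [3]], we have that the
condition for correct detection holds with probability at least
$(1 - q^{-1})(1 - e^2 q^{-1}) = 1 - (e^2 + 1) q^{-1} + O(q^{-2})$.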
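The choice of $\tau$ in this proof can be verified numerically: with $\tau = 2\sigma\sqrt{2 \log q + (r/2) \log 2}$, the per-group Chernoff bound $e^{-\tau^2/(4\sigma^2)}\, 2^{r/2}$ multiplied by the number of groups $q$ equals exactly $q^{-1}$. The short sketch below checks this identity; the function names and the values of $\sigma$, $q$, and $r$ are hypothetical.

```python
import math


def noise_threshold(sigma, q, r):
    """tau = 2 * sigma * sqrt(2 * log(q) + (r / 2) * log(2))."""
    return 2 * sigma * math.sqrt(2 * math.log(q) + 0.5 * r * math.log(2))


def union_tail_bound(tau, sigma, q, r):
    """q * exp(-tau^2 / (4 sigma^2)) * 2^(r/2): Chernoff bound (t = 1/4) over q groups."""
    return q * math.exp(-tau ** 2 / (4 * sigma ** 2)) * 2 ** (r / 2)


sigma, q, r = 1.5, 32, 8             # hypothetical values
tau = noise_threshold(sigma, q, r)
bound = union_tail_bound(tau, sigma, q, r)   # equals 1 / q up to rounding
```

The identity holds for any $\sigma > 0$, since $\sigma$ cancels in the exponent.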